@rozza rozza commented Oct 9, 2025

Moved specifications submodule out of driver-core into a testing directory.
Deleted the old copy of the BSON specification tests and updated to use the submodule.
Fixed JSON output for positive exponents to match the Extended JSON specification.

Added test exceptions to BinaryVectorGenericBsonTest.
Added BsonBinaryVector prose tests.
Added extra regression tests to ensure both explicit and implicit doubles with positive exponents parse as expected.

JAVA-5877
JAVA-5779
JAVA-5782
JAVA-5652

@rozza rozza requested a review from vbabanin October 9, 2025 10:51
@rozza rozza requested a review from a team as a code owner October 9, 2025 10:51
@rozza rozza requested a review from Copilot October 9, 2025 10:51
Copilot AI left a comment

Pull Request Overview

This PR reorganizes specification testing by moving the MongoDB specifications submodule from driver-core to a shared testing directory, updates JSON output for positive exponents to match the extended JSON specification, and adds comprehensive testing for BsonBinaryVector functionality.

  • Moved specifications submodule from driver-core to shared testing directory for better organization
  • Fixed JSON output to include explicit "+" signs for positive exponents in extended JSON
  • Added comprehensive testing for BsonBinaryVector including prose tests and enhanced validation
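
To illustrate the exponent change: Java's `Double.toString` renders `1e20` as `1.0E20`, with no sign on the exponent, while the PR describes emitting an explicit `+`. The sketch below is not the driver's converter code. `ExponentSketch` and `withExplicitPlus` are hypothetical names, and it assumes the only adjustment needed is inserting `+` after `E` when the exponent is not negative.

```java
// Hypothetical helper for illustration only; not the driver's implementation.
final class ExponentSketch {
    private ExponentSketch() {
    }

    // Returns Double.toString(value) with "+" inserted after 'E'
    // when the exponent has no sign. Other values pass through unchanged.
    static String withExplicitPlus(double value) {
        String text = Double.toString(value);
        int e = text.indexOf('E');
        if (e < 0 || text.charAt(e + 1) == '-') {
            return text; // no exponent, or an exponent that already has a sign
        }
        return text.substring(0, e + 1) + "+" + text.substring(e + 1);
    }
}
```

On the parsing side, plain Java treats both spellings the same: `Double.parseDouble("1.0E+20")` and `Double.parseDouble("1.0E20")` return equal values. The regression tests mentioned in the description check the equivalent behaviour in the driver's own parser.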

Reviewed Changes

Copilot reviewed 47 out of 49 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `.gitmodules` | Updated submodule path from driver-core to testing directory |
| `bson/build.gradle.kts` | Added test resource processing to include shared testing resources |
| `driver-core/build.gradle.kts` | Added test resource processing to include shared testing resources |
| `bson/src/main/org/bson/json/*.java` | Modified double converters to use consistent JSON formatting with explicit positive exponents |
| `bson/src/main/org/bson/BinaryVector.java` | Added validation logging for non-zero padding bits |
| `bson/src/test/unit/org/bson/vector/BinaryVectorGenericBsonTest.java` | Enhanced test coverage and updated to use new spec test location |
| `bson/src/test/unit/org/bson/*.java` | Updated test references to use new specification test helper method |
| `bson/src/test/resources/bson*/` | Removed old BSON specification test files (now using submodule) |

@DisplayName("Treatment of non-zero ignored bits: 2. Decoding")
@Test
void decodingWithNonZeroIgnoredBits() {
Member

BinaryVectorTest only covers use cases for creating BinaryVectors through the construction methods of the BinaryVector API.

We have a separate class, BsonBinaryTest, for decoding tests, with methods such as shouldDecodeInt8Vector and shouldEncodePackedBitVector. I suggest we move this decoding test method there to keep the decoding tests together.

Member Author

Renamed this file to BinaryVectorProseTest as these include the new tests in the spec.

Member Author

Note that there are also corpus tests that cover the BsonBinary API; these prose and extra tests provide additional coverage of the API.

Member

@vbabanin vbabanin Nov 20, 2025

Thanks - a couple of clarifications from my side:

Regarding grouping tests:
The extra API-level tests here make sense; I wasn’t suggesting removing them, only that the decoding part might fit more naturally alongside the existing decoding tests in BsonBinaryTest, since that class already covers decoding scenarios.

About the rename to BinaryVectorProseTest:
As I see it, we typically use ProseTest to refer to tests that directly correspond to prose sections in a specification (with a spec reference). In this case we have corpus tests, but (as far as I can tell) no prose-spec requirements.

If that’s correct, then the ProseTest naming might be a bit confusing long-term. Maybe leaving it as BinaryVectorTest, or choosing another descriptive name, would better reflect the content.

What do you think?

Member Author

I've moved the prose test back to org.bson.vector and added links to the specs and prose tests.

}

tasks.processTestResources {
from("${rootProject.projectDir}/testing/resources")
Member

Nice! In the future, I think we could also consider creating a separate module for testing utilities, so shared helpers can be reused across the sync, reactive, core, and BSON modules.

rozza commented Oct 21, 2025

@vbabanin I've updated the names of the tests to be more in line with the spec location and to identify the prose tests.

I also moved the validation to the constructors as it seemed the best place to have it.

@rozza rozza left a comment

Updated the code

@rozza rozza requested a review from vbabanin October 21, 2025 10:44
Comment on lines +50 to +58
isTrueArgument("Padding must be between 0 and 7 bits. Provided padding: " + padding, padding >= 0 && padding <= 7);
isTrueArgument("Padding must be 0 if vector is empty. Provided padding: " + padding, padding == 0 || data.length > 0);
if (padding > 0) {
    int mask = (1 << padding) - 1;
    if ((data[data.length - 1] & mask) != 0) {
        // JAVA-5848 in version 6.0.0 will convert this logging into an IllegalArgumentException
        LOGGER.warn("The last " + padding + " padded bits should be zero in the final byte.");
    }
}
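
A small worked example of the mask check in the lines above, as a standalone sketch. `PaddingMaskSketch` and `hasNonZeroPaddingBits` are hypothetical names used only for illustration; the arithmetic is the same `(1 << padding) - 1` mask applied to the final byte.

```java
// Hypothetical helper for illustration only; mirrors the mask arithmetic in the diff.
final class PaddingMaskSketch {
    private PaddingMaskSketch() {
    }

    // True when any of the lowest `padding` bits of the final byte is set.
    static boolean hasNonZeroPaddingBits(byte[] data, int padding) {
        if (padding == 0 || data.length == 0) {
            return false; // nothing is ignored, so nothing to check
        }
        int mask = (1 << padding) - 1; // padding = 3 gives 0b0000_0111
        return (data[data.length - 1] & mask) != 0;
    }
}
```

With `padding = 3`, a final byte of `0b1111_1000` passes because its three lowest bits are zero, while `0b1111_1001` would trigger the warning.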
Member

@vbabanin vbabanin Nov 20, 2025

I see the rationale for moving the validation into the constructor. The reason it wasn’t placed there originally is that we had two different validation paths:

  1. For user input, we throw an IllegalArgumentException (isTrueArgument(...)).
  2. For values coming from the server, an invalid state indicates unexpected and corrupted data, which is why those checks used assert instead (isTrue(...)). See: BinaryVectorHelper.java#L97-L98.
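
The two paths above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `checkArgument` and `checkState` are hypothetical stand-ins for `isTrueArgument(...)` and `isTrue(...)`, and the use of `IllegalStateException` on the decoding path is an assumption made for the sketch.

```java
// Hypothetical stand-ins for illustration only; not the driver's assertion helpers.
final class ValidationPathsSketch {
    private ValidationPathsSketch() {
    }

    // User-input path: a bad value is the caller's mistake.
    static void checkArgument(String message, boolean condition) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    // Decoding path: a bad value means unexpected or corrupted data.
    // The exception type here is an assumption.
    static void checkState(String message, boolean condition) {
        if (!condition) {
            throw new IllegalStateException(message);
        }
    }

    static void validateUserPadding(int padding) {
        checkArgument("Padding must be between 0 and 7 bits. Provided padding: " + padding,
                padding >= 0 && padding <= 7);
    }

    static void validateDecodedPadding(int padding) {
        checkState("Padding must be between 0 and 7 bits. Found padding: " + padding,
                padding >= 0 && padding <= 7);
    }
}
```

The same condition is checked in both places; only the kind of failure reported differs.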

Putting the checks in the constructor means we now perform the same validation twice, which we originally tried to avoid for performance reasons, given that these checks could be executed frequently. We haven’t measured the exact impact, but it does introduce extra work.

As a side note:
We still haven’t fully aligned on validation boundaries, whether to validate strictly at API entry points (and accept duplication) or centralize deeper in the code. The driver currently uses both approaches in different areas. This might be worth discussing as a team at some point.

Member Author

I've removed the checks from the helper so they live only in the constructor, as we have a public API, BsonVector.packedBitVector, that also requires the validation. This removes the double validation but does put the validation deeper in the code.

Member Author

I ended up reverting that, as we return different errors depending on the scenario: decoding data or constructing a new instance.

I think the cost of the duplicated check will be minor; however, the public API does also require guarding against invalid data.

It's a slightly unusual scenario. We could add a flag to the constructor to skip the extra checks, but that would also be setting up a new convention.
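
For reference, a sketch of what such a constructor flag might look like. This is not something the PR does. `PackedBitsSketch`, `of`, and `fromDecoded` are hypothetical names, and the sketch assumes the decoding path has already run its own checks before calling `fromDecoded`.

```java
// Hypothetical class for illustration only; not part of the driver.
final class PackedBitsSketch {
    private final byte[] data;
    private final int padding;

    // The flag lets a caller that has already validated skip the repeated check.
    private PackedBitsSketch(byte[] data, int padding, boolean validate) {
        if (validate) {
            if (padding < 0 || padding > 7) {
                throw new IllegalArgumentException(
                        "Padding must be between 0 and 7 bits. Provided padding: " + padding);
            }
            if (padding != 0 && data.length == 0) {
                throw new IllegalArgumentException(
                        "Padding must be 0 if vector is empty. Provided padding: " + padding);
            }
        }
        this.data = data;
        this.padding = padding;
    }

    // Public entry point: always validates user input.
    static PackedBitsSketch of(byte[] data, int padding) {
        return new PackedBitsSketch(data, padding, true);
    }

    // Decoding entry point: assumes the decoder validated the bytes already.
    static PackedBitsSketch fromDecoded(byte[] data, int padding) {
        return new PackedBitsSketch(data, padding, false);
    }

    int padding() {
        return padding;
    }

    int length() {
        return data.length;
    }
}
```

The trade-off is the one raised above: the duplicate check goes away, but correctness now depends on every caller of the unchecked path having validated first.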

@rozza rozza requested a review from vbabanin November 20, 2025 10:10